Statistical Significance of Normalized Global Alignment
Abstract
Comparing homologous proteins from different species is a first step toward assigning function and reconstructing the evolution of those species. Although local alignment is mostly used for this purpose, global alignment is important for constructing multiple alignments and phylogenetic trees. However, the statistical significance of global alignments is not well established: there is no specific statistical model that describes them, and the available alternatives, such as the Z-score, are computationally expensive. We recently presented a normalized global alignment, defined as the best compromise between global alignment cost and length, and showed that it yields better classification results than the Z-score at a much lower computational cost. For normalized global alignment to be considered a fully functional algorithm for protein alignment, however, its statistical significance must also be analyzed. Experiments with unrelated proteins extracted from the SCOP ASTRAL database showed that normalized global alignment scores can be fitted to a log-normal distribution. This result is empirical, obtained without theoretical support, but it can be used to derive the statistical significance of normalized global alignments. The results are summarized in a table of fitted parameters for different scoring schemes.
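As a rough illustration of how a fitted log-normal distribution could be turned into a significance estimate, the sketch below fits the two log-normal parameters to a set of background scores and evaluates an upper-tail probability for an observed score. It is not the authors' procedure: the function names `fit_lognormal` and `lognormal_sf` are hypothetical, the background scores are synthetic stand-ins for alignments of unrelated proteins, and the choice of the upper tail assumes that larger normalized scores indicate greater similarity (for a cost-like score the lower tail would be used instead).

```python
import math
import random


def fit_lognormal(scores):
    """Maximum-likelihood fit of a log-normal: mean and std of log(score).

    Assumes every score is strictly positive.
    """
    logs = [math.log(s) for s in scores]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((x - mu) ** 2 for x in logs) / n
    return mu, math.sqrt(var)


def lognormal_sf(score, mu, sigma):
    """Upper-tail probability P(X >= score) under LogNormal(mu, sigma)."""
    z = (math.log(score) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))


# Synthetic background (assumption): stands in for normalized scores
# obtained by aligning unrelated protein pairs.
random.seed(0)
background = [random.lognormvariate(1.0, 0.25) for _ in range(5000)]

mu, sigma = fit_lognormal(background)

# Significance estimate for a hypothetical observed score of 6.0.
p_value = lognormal_sf(6.0, mu, sigma)
print(f"mu={mu:.3f} sigma={sigma:.3f} p-value={p_value:.2e}")
```

In practice the parameters would be fitted separately for each scoring scheme, since the abstract reports one set of fitted parameters per scheme.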
Journal:
- Journal of computational biology : a journal of computational molecular cell biology
Volume 21, Issue 3
Publication date: 2014